<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE topic
  PUBLIC "-//OASIS//DTD DITA Composite//EN" "ditabase.dtd">
<topic id="topic1">
    <title id="it20941">gpinitsystem</title>
    <body>
        <p>Initializes a Greenplum Database system using configuration parameters specified in the
                <codeph>gpinitsystem_config</codeph> file.</p>
        <section id="section2">
            <title>Synopsis</title>
            <codeblock>gpinitsystem -c &lt;cluster_configuration_file> 
            [-h &lt;hostfile_gpinitsystem>]
            [-B &lt;parallel_processes>] 
            [-p &lt;postgresql_conf_param_file>]
            [-s &lt;standby_master_host>
                [-P &lt;standby_master_port>]
                [-S &lt;standby_master_datadir> | --standby_datadir=&lt;standby_master_datadir>]]
            [--ignore-warnings]
            [-m &lt;number> | --max_connections=&lt;number>]
            [-b &lt;size> | --shared_buffers=&lt;size>]
            [-n &lt;locale> | --locale=&lt;locale>] [--lc-collate=&lt;locale>] 
            [--lc-ctype=&lt;locale>] [--lc-messages=&lt;locale>] 
            [--lc-monetary=&lt;locale>] [--lc-numeric=&lt;locale>] 
            [--lc-time=&lt;locale>] [-e &lt;password> | --su_password=&lt;password>] 
            [--mirror-mode={group|spread}] [-a] [-q] [-l &lt;logfile_directory>] [-D]
            [-I &lt;input_configuration_file>]
            [-O &lt;output_configuration_file>]

gpinitsystem -v | --version

gpinitsystem -? | --help</codeblock>
        </section>
        <section id="section3">
            <title>Description</title>
            <p>The <codeph>gpinitsystem</codeph> utility creates a Greenplum Database instance or
                writes an input configuration file using the values defined in a cluster
                configuration file and any command-line options that you provide. See <xref
                    href="#topic1/section5" type="section" format="dita"/> for more information
                about the configuration file. Before running this utility, make sure that you have
                installed the Greenplum Database software on all the hosts in the array.</p>
            <p>With the <codeph>-O &lt;output_configuration_file></codeph> option,
                    <codeph>gpinitsystem</codeph> writes all provided configuration information to
                the specified output file. This file can be used with the <codeph>-I</codeph> option
                to create a new cluster or re-create a cluster from a backed up configuration. See
                    <xref href="#topic1/section5" type="section" format="dita"/> for more
                information. </p>
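            <p>One possible workflow (the file paths shown are illustrative) uses
                    <codeph>-O</codeph> to capture the configuration and <codeph>-I</codeph> to
                replay it:</p>
            <codeblock># Capture the configuration defined in gpinitsystem_config to an output file
gpinitsystem -c gpinitsystem_config -O /home/gpadmin/cluster_init.config

# Later, create (or re-create) the cluster from the saved configuration
gpinitsystem -I /home/gpadmin/cluster_init.config</codeblock>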
            <p>In a Greenplum Database system, each database instance (the master instance and all
                segment instances) must be initialized across all of the hosts in the system in such
                a way that they can all work together as a unified DBMS. The
                    <codeph>gpinitsystem</codeph> utility takes care of initializing the Greenplum
                master and each segment instance, and configuring the system as a whole.</p>
            <p>Before running <codeph>gpinitsystem</codeph>, you must set the
                    <codeph>$GPHOME</codeph> environment variable to point to the location of your
                Greenplum Database installation on the master host and exchange SSH keys between all
                host addresses in the array using <codeph>gpssh-exkeys</codeph>.</p>
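            <p>For example, the prerequisite steps might look like the following (the installation
                path and host file name are illustrative):</p>
            <codeblock># Set the Greenplum environment on the master host
export GPHOME=/usr/local/greenplum-db
source $GPHOME/greenplum_path.sh

# Exchange SSH keys with all host addresses in the array
gpssh-exkeys -f hostfile_exkeys</codeblock>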
            <p>This utility performs the following tasks:</p>
            <ul>
                <li id="it147947">Verifies that the parameters in the configuration file are
                    correct.</li>
                <li id="it147948">Ensures that a connection can be established to each host address.
                    If a host address cannot be reached, the utility will exit.</li>
                <li id="it147949">Verifies the locale settings.</li>
                <li id="it147950">Displays the configuration that will be used and prompts the user
                    for confirmation.</li>
                <li id="it147951">Initializes the master instance.</li>
                <li id="it147952">Initializes the standby master instance (if specified).</li>
                <li id="it147953">Initializes the primary segment instances.</li>
                <li id="it147954">Initializes the mirror segment instances (if mirroring is
                    configured).</li>
                <li id="it147957">Configures the Greenplum Database system and checks for
                    errors.</li>
                <li id="it147961">Starts the Greenplum Database system.</li>
            </ul>
            <note>This utility uses secure shell (SSH) connections between systems to perform its
                tasks. In large Greenplum Database deployments, cloud deployments, or deployments
                with a large number of segments per host, this utility may exceed the host's maximum
                threshold for unauthenticated connections. Consider updating the SSH
                    <codeph>MaxStartups</codeph> and <codeph>MaxSessions</codeph> configuration
                parameters to increase this threshold. For more information about SSH configuration
                options, refer to the SSH documentation for your Linux distribution.</note>
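            <p>For example, on hosts where the default limits are too low, you might raise them in
                    <codeph>/etc/ssh/sshd_config</codeph> (the values shown are illustrative, not
                recommendations) and then reload the SSH daemon:</p>
            <codeblock>MaxStartups 100:30:1000
MaxSessions 200</codeblock>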
        </section>
        <section id="section4">
            <title>Options</title>
            <parml>
                <plentry>
                    <pt>-a</pt>
                    <pd>Do not prompt the user for confirmation.</pd>
                </plentry>
                <plentry>
                    <pt>-B <varname>parallel_processes</varname></pt>
                    <pd>The number of segments to create in parallel. If not specified, the utility
                        will start up to 4 parallel processes at a time.</pd>
                </plentry>
                <plentry>
                    <pt>-c <varname>cluster_configuration_file</varname></pt>
                    <pd>Required. The full path and filename of the configuration file, which
                        contains all of the defined parameters to configure and initialize a new
                        Greenplum Database system. See <xref href="#topic1/section5" type="section"
                            format="dita"/> for a description of this file. You must provide either
                        the <codeph>-c &lt;cluster_configuration_file></codeph> option or the
                            <codeph>-I &lt;input_configuration_file></codeph> option to
                            <codeph>gpinitsystem</codeph>. </pd>
                </plentry>
                <plentry>
                    <pt>-D</pt>
                    <pd>Sets log output level to debug.</pd>
                </plentry>
                <plentry>
                    <pt>-h <varname>hostfile_gpinitsystem</varname></pt>
                    <pd>Optional. The full path and filename of a file that contains the host
                        addresses of your segment hosts. If not specified on the command line, you
                        can specify the host file using the <codeph>MACHINE_LIST_FILE</codeph>
                        parameter in the <codeph>gpinitsystem_config</codeph> file.</pd>
                </plentry>
                <plentry>
                    <pt>-I <varname>input_configuration_file</varname></pt>
                    <pd>The full path and filename of an input configuration file, which defines the
                        Greenplum Database host systems, the master instance and segment instances
                        on the hosts, using the <codeph>QD_PRIMARY_ARRAY</codeph>,
                            <codeph>PRIMARY_ARRAY</codeph>, and <codeph>MIRROR_ARRAY</codeph>
                        parameters. The input configuration file is typically created by using
                            <codeph>gpinitsystem</codeph> with the <codeph>-O
                                <varname>output_configuration_file</varname></codeph> option. Edit
                        those parameters in order to initialize a new cluster or re-create a cluster
                        from a backed up configuration. You must provide either the <codeph>-c
                            &lt;cluster_configuration_file></codeph> option or the <codeph>-I
                            &lt;input_configuration_file></codeph> option to
                            <codeph>gpinitsystem</codeph>. </pd>
                </plentry>
                <plentry>
                    <pt>--ignore-warnings</pt>
                    <pd>Controls the value returned by <codeph>gpinitsystem</codeph> when warnings
                        or an error occurs. The utility returns 0 if system initialization completes
                        without warnings. If only warnings occur, system initialization completes
                        and the system is operational.</pd>
                    <pd>With this option, <codeph>gpinitsystem</codeph> also returns 0 if warnings
                        occurred during system initialization, and returns a non-zero value if a
                        fatal error occurs.</pd>
                    <pd>If this option is not specified, <codeph>gpinitsystem</codeph> returns 1 if
                        initialization completes with warnings, and returns a value of 2 or greater if
                        a fatal error occurs.</pd>
                    <pd> See the <codeph>gpinitsystem</codeph> log file for warning and error
                        messages.</pd>
                </plentry>
                <plentry>
                    <pt> -n <varname>locale</varname> | --locale=<varname>locale</varname>
                    </pt>
                    <pd>Sets the default locale used by Greenplum Database. If not specified, the 
                        default locale is <codeph>en_US.utf8</codeph>. A locale identifier consists of a language
                        identifier and a region identifier, and optionally a character set encoding.
                        For example, <codeph>sv_SE</codeph> is Swedish as spoken in Sweden,
                            <codeph>en_US</codeph> is U.S. English, and <codeph>fr_CA</codeph> is
                        French Canadian. If more than one character set encoding is available for a
                        locale, the specification also includes the encoding, as in
                            <codeph>en_US.UTF-8</codeph>. On most systems, the command
                            <codeph>locale</codeph> will show the locale environment settings and
                            <codeph>locale -a</codeph> will show a list of all available
                        locales.</pd>
                </plentry>
                <plentry>
                    <pt>--lc-collate=<varname>locale</varname></pt>
                    <pd>Similar to <codeph>--locale</codeph>, but sets the locale used for collation
                        (sorting data). The sort order cannot be changed after Greenplum Database is
                        initialized, so it is important to choose a collation locale that is
                        compatible with the character set encodings that you plan to use for your
                        data. There is a special collation name of <codeph>C</codeph> or
                            <codeph>POSIX</codeph> (byte-order sorting as opposed to
                        dictionary-order sorting). The <codeph>C</codeph> collation can be used with
                        any character encoding.</pd>
                </plentry>
                <plentry>
                    <pt>--lc-ctype=<varname>locale</varname></pt>
                    <pd>Similar to <codeph>--locale</codeph>, but sets the locale used for character
                        classification (what character sequences are valid and how they are
                        interpreted). This cannot be changed after Greenplum Database is
                        initialized, so it is important to choose a character classification locale
                        that is compatible with the data you plan to store in Greenplum
                        Database.</pd>
                </plentry>
                <plentry>
                    <pt>--lc-messages=<varname>locale</varname></pt>
                    <pd>Similar to <codeph>--locale</codeph>, but sets the locale used for messages
                        output by Greenplum Database. The current version of Greenplum Database does
                        not support multiple locales for output messages (all messages are in
                        English), so changing this setting will not have any effect.</pd>
                </plentry>
                <plentry>
                    <pt>--lc-monetary=<varname>locale</varname></pt>
                    <pd>Similar to <codeph>--locale</codeph>, but sets the locale used for
                        formatting currency amounts.</pd>
                </plentry>
                <plentry>
                    <pt>--lc-numeric=<varname>locale</varname></pt>
                    <pd>Similar to <codeph>--locale</codeph>, but sets the locale used for
                        formatting numbers.</pd>
                </plentry>
                <plentry>
                    <pt>--lc-time=<varname>locale</varname></pt>
                    <pd>Similar to <codeph>--locale</codeph>, but sets the locale used for
                        formatting dates and times.</pd>
                </plentry>
                <plentry>
                    <pt>-l <varname>logfile_directory</varname></pt>
                    <pd>The directory in which to write the log file. Defaults to
                            <codeph>~/gpAdminLogs</codeph>.</pd>
                </plentry>
                <plentry>
                    <pt> -m <varname>number</varname> | --max_connections=<varname>number</varname></pt>
                    <pd>Sets the maximum number of client connections allowed to the master. The
                        default is 250.</pd>
                </plentry>
                <plentry id="output_config_file">
                    <pt>-O <varname>output_configuration_file</varname></pt>
                    <pd>Optional, used during new cluster initialization. This option writes the
                            <varname>cluster_configuration_file</varname> information (used with <codeph>-c</codeph>)
                        to the specified <varname>output_configuration_file</varname>. This file
                        defines the Greenplum Database members using the
                            <codeph>QD_PRIMARY_ARRAY</codeph>, <codeph>PRIMARY_ARRAY</codeph>, and
                            <codeph>MIRROR_ARRAY</codeph> parameters. Use this file as a template
                        for the <codeph>-I </codeph><varname>input_configuration_file</varname>
                        option. See <xref href="#topic1/section6" format="dita">Examples</xref> for
                        more information. </pd>
                </plentry>
                <plentry>
                    <pt>-p <varname>postgresql_conf_param_file</varname></pt>
                    <pd>Optional. The name of a file that contains <codeph>postgresql.conf</codeph>
                        parameter settings that you want to set for Greenplum Database. These
                        settings will be used when the individual master and segment instances are
                        initialized. You can also set parameters after initialization using the
                            <codeph>gpconfig</codeph> utility.</pd>
                </plentry>
                <plentry>
                    <pt>-q</pt>
                    <pd>Run in quiet mode. Command output is not displayed on the screen, but is
                        still written to the log file.</pd>
                </plentry>
                <plentry>
                    <pt> -b <varname>size</varname> | --shared_buffers=<varname>size</varname></pt>
                    <pd>Sets the amount of memory a Greenplum server instance uses for shared memory
                        buffers. You can specify sizing in kilobytes (kB), megabytes (MB) or
                        gigabytes (GB). The default is 125MB.</pd>
                </plentry>
                <plentry>
                    <pt>-s <varname>standby_master_host</varname></pt>
                    <pd>Optional. To configure a standby master instance, specify its host
                        name with this option. The Greenplum Database software must already be
                        installed and configured on this host.</pd>
                </plentry>
                <plentry>
                    <pt>-P <varname>standby_master_port</varname></pt>
                    <pd>If you configure a standby master instance with <codeph>-s</codeph>, specify
                        its port number using this option. The default port is the same as the
                        master port. To run the standby and master on the same host, you must use
                        this option to specify a different port for the standby. The Greenplum
                        Database software must already be installed and configured on the standby
                        host.</pd>
                </plentry>
                <plentry>
                    <pt>-S <varname>standby_master_datadir</varname> |
                            --standby_datadir=<varname>standby_master_datadir</varname></pt>
                    <pd>If you configure a standby master host with <codeph>-s</codeph>, use this
                        option to specify its data directory. If you configure a standby on the same
                        host as the master instance, the master and standby must have separate data
                        directories.</pd>
                </plentry>
                <plentry>
                    <pt> -e <varname>superuser_password</varname> |
                            --su_password=<varname>superuser_password</varname></pt>
                    <pd>Use this option to specify the password to set for the Greenplum Database
                        superuser account (such as <codeph>gpadmin</codeph>). If this option is not
                        specified, the default password <codeph>gparray</codeph> is assigned to the
                        superuser account. You can use the <codeph>ALTER ROLE</codeph> command to
                        change the password at a later time. <p>Recommended security best
                            practices:</p><ul id="ul_vnd_kt2_44">
                            <li id="it148039">Do not use the default password option for production
                                environments.</li>
                            <li id="it148040">Change the password immediately after
                                installation.</li>
                        </ul></pd>
                </plentry>
                <plentry>
                    <pt>--mirror-mode={group|spread}</pt>
                    <pd>Use this option to specify the placement of mirror segment instances on the
                        segment hosts. The default, <codeph>group</codeph>, groups the mirror
                        segments for all of a host's primary segments on a single alternate host.
                            <codeph>spread</codeph> spreads mirror segments for the primary segments
                        on a host across different hosts in the Greenplum Database array. Spreading
                        is only allowed if the number of hosts is greater than the number of segment
                        instances per host. See <xref
                            href="../../admin_guide/highavail/topics/g-overview-of-segment-mirroring.html#topic3" format="html" scope="external"
                        >Overview of Segment Mirroring</xref> for information about Greenplum Database mirroring strategies. </pd>
                </plentry>
                <plentry>
                    <pt>-v | --version</pt>
                    <pd>Print the <codeph>gpinitsystem</codeph> version and exit.</pd>
                </plentry>
                <plentry>
                    <pt>-? | --help</pt>
                    <pd>Show help about <codeph>gpinitsystem</codeph> command line arguments, and
                        exit.</pd>
                </plentry>
            </parml>
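            <p>For example, the following command (host names and file names are illustrative)
                combines several of these options to initialize a cluster with a standby master and
                spread mirroring:</p>
            <codeblock>gpinitsystem -c gpinitsystem_config -h hostfile_gpinitsystem \
    -s smdw -S /gpdata/standby --mirror-mode=spread</codeblock>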
        </section>
        <section id="section5">
            <title id="it148051">Initialization Configuration File Format</title>
            <p><codeph>gpinitsystem</codeph> requires a cluster configuration file with the
                following parameters defined. An example initialization configuration file can be
                found in <codeph>$GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config</codeph>.</p>
            <p>To avoid port conflicts between Greenplum Database and other applications, the
                Greenplum Database port numbers should not be in the range specified by the
                operating system parameter <codeph>net.ipv4.ip_local_port_range</codeph>. For
                example, if <codeph>net.ipv4.ip_local_port_range = 10000 65535</codeph>, you could
                set Greenplum Database base port numbers to these values.</p>
            <p>
                <codeblock>PORT_BASE=6000
MIRROR_PORT_BASE=7000</codeblock>
            </p>
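            <p>To check the range on a host before choosing base port numbers, you can run the
                following command (the output varies by system):</p>
            <codeblock>sysctl net.ipv4.ip_local_port_range</codeblock>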
            <parml>
                <plentry>
                    <pt>ARRAY_NAME</pt>
                    <pd><b>Required.</b> A name for the cluster you are configuring. You can use any
                        name you like. Enclose the name in quotes if the name contains spaces.</pd>
                </plentry>
                <plentry>
                    <pt>MACHINE_LIST_FILE</pt>
                    <pd><b>Optional.</b> Can be used in place of the <codeph>-h</codeph> option.
                        This specifies the file that contains the list of the segment host address
                        names that comprise the Greenplum Database system. The master host is
                        assumed to be the host from which you are running the utility and should not
                        be included in this file. If your segment hosts have multiple network
                        interfaces, then this file would include all addresses for the host. Give
                        the absolute path to the file.</pd>
                </plentry>
                <plentry>
                    <pt>SEG_PREFIX</pt>
                    <pd><b>Required.</b> This specifies a prefix that will be used to name the data
                        directories on the master and segment instances. The naming convention for
                        data directories in a Greenplum Database system is
                            <codeph>SEG_PREFIX</codeph><varname>number</varname>, where <varname>number</varname>
                        starts with 0 for segment instances (the master is always -1). So for
                        example, if you choose the prefix <codeph>gpseg</codeph>, your master
                        instance data directory would be named <codeph>gpseg-1</codeph>, and the
                        segment instances would be named <codeph>gpseg0</codeph>,
                            <codeph>gpseg1</codeph>, <codeph>gpseg2</codeph>,
                            <codeph>gpseg3</codeph>, and so on.</pd>
                </plentry>
                <plentry>
                    <pt id="port_base">PORT_BASE</pt>
                    <pd><b>Required.</b> This specifies the base number by which primary segment
                        port numbers are calculated. The first primary segment port on a host is set
                        as <codeph>PORT_BASE</codeph>, and then incremented by one for each
                        additional primary segment on that host. Valid values range from 1 through
                        65535.</pd>
                </plentry>
                <plentry>
                    <pt id="data_directory">DATA_DIRECTORY</pt>
                    <pd><b>Required.</b> This specifies the data storage location(s) where the
                        utility will create the primary segment data directories. The number of
                        locations in the list dictates the number of primary segments that will be
                        created per physical host (if multiple addresses for a host are listed in
                        the host file, the number of segments will be spread evenly across the
                        specified interface addresses). It is OK to list the same data storage area
                        multiple times if you want your data directories created in the same
                        location. The user who runs <codeph>gpinitsystem</codeph> (for example, the
                            <codeph>gpadmin</codeph> user) must have permission to write to these
                        directories. For example, this will create six primary segments per
                        host:</pd>
                    <pd>
                        <codeblock>declare -a DATA_DIRECTORY=(/data1/primary /data1/primary 
/data1/primary /data2/primary /data2/primary /data2/primary)</codeblock>
                    </pd>
                </plentry>
                <plentry>
                    <pt>MASTER_HOSTNAME</pt>
                    <pd><b>Required.</b> The host name of the master instance. This host name must
                        exactly match the configured host name of the machine (run the
                            <codeph>hostname</codeph> command to determine the correct
                        hostname).</pd>
                </plentry>
                <plentry>
                    <pt>MASTER_DIRECTORY</pt>
                    <pd><b>Required.</b> This specifies the location where the data directory will
                        be created on the master host. You must make sure that the user who runs
                            <codeph>gpinitsystem</codeph> (for example, the <codeph>gpadmin</codeph>
                        user) has permissions to write to this directory.</pd>
                </plentry>
                <plentry>
                    <pt>MASTER_PORT</pt>
                    <pd><b>Required.</b> The port number for the master instance. This is the port
                        number that users and client connections will use when accessing the
                        Greenplum Database system.</pd>
                </plentry>
                <plentry>
                    <pt>TRUSTED_SHELL</pt>
                    <pd><b>Required.</b> The shell the <codeph>gpinitsystem</codeph> utility uses to
                        run commands on remote hosts. The only allowed value is <codeph>ssh</codeph>.
                        You must set up your trusted host environment before running the
                            <codeph>gpinitsystem</codeph> utility (you can use
                            <codeph>gpssh-exkeys</codeph> to do this).</pd>
                </plentry>
                <plentry>
                    <pt>CHECK_POINT_SEGMENTS</pt>
                    <pd><b>Required.</b> Maximum distance between automatic write ahead log (WAL)
                        checkpoints, in log file segments (each segment is normally 16 megabytes).
                        This will set the <codeph>checkpoint_segments</codeph> parameter in the
                            <codeph>postgresql.conf</codeph> file for each segment instance in the
                        Greenplum Database system.</pd>
                </plentry>
                <plentry>
                    <pt>ENCODING</pt>
                    <pd><b>Required.</b> The character set encoding to use. This character set must
                        be compatible with the <codeph>--locale</codeph> settings used, especially
                            <codeph>--lc-collate</codeph> and <codeph>--lc-ctype</codeph>. Greenplum
                        Database supports the same character sets as PostgreSQL.</pd>
                </plentry>
                <plentry>
                    <pt>DATABASE_NAME</pt>
                    <pd><b>Optional.</b> The name of a Greenplum Database database to create after
                        the system is initialized. You can always create a database later using the
                            <codeph>CREATE DATABASE</codeph> command or the
                            <codeph>createdb</codeph> utility.</pd>
                </plentry>
                <plentry>
                    <pt id="mirror_port_base">MIRROR_PORT_BASE</pt>
                    <pd><b>Optional.</b> This specifies the base number by which mirror segment port
                        numbers are calculated. The first mirror segment port on a host is set as
                            <codeph>MIRROR_PORT_BASE</codeph>, and then incremented by one for each
                        additional mirror segment on that host. Valid values range from 1 through
                        65535 and cannot conflict with the ports calculated by
                            <codeph>PORT_BASE</codeph>.</pd>
                </plentry>
                <plentry>
                    <pt>MIRROR_DATA_DIRECTORY</pt>
                    <pd><b>Optional.</b> This specifies the data storage location(s) where the
                        utility will create the mirror segment data directories. There must be the
                        same number of data directories declared for mirror segment instances as for
                        primary segment instances (see the <codeph>DATA_DIRECTORY</codeph>
                        parameter). The user who runs <codeph>gpinitsystem</codeph> (for example,
                        the <codeph>gpadmin</codeph> user) must have permission to write to these
                        directories. For example:</pd>
                    <pd>
                        <codeblock>declare -a MIRROR_DATA_DIRECTORY=(/data1/mirror 
/data1/mirror /data1/mirror /data2/mirror /data2/mirror 
/data2/mirror)</codeblock>
                    </pd>
                </plentry>
                <plentry id="array_params">
                    <pt>QD_PRIMARY_ARRAY, PRIMARY_ARRAY, MIRROR_ARRAY</pt>
                    <pd><b>Required</b> when using an <codeph>input_configuration_file</codeph> with
                            the <codeph>-I</codeph> option. These parameters specify the Greenplum
                        Database master instance, the primary segment instances, and the mirror
                        segment instances, respectively. During new cluster initialization, use the
                            <codeph>gpinitsystem</codeph>
                        <codeph>-O <varname>output_configuration_file</varname></codeph> option to populate
                            <codeph>QD_PRIMARY_ARRAY</codeph>, <codeph>PRIMARY_ARRAY</codeph>, and
                            <codeph>MIRROR_ARRAY</codeph>.</pd>
                    <pd>To initialize a new cluster or re-create a cluster from a backed up
                        configuration, edit these values in the input configuration file used with
                        the <codeph>gpinitsystem</codeph>
                        <codeph>-I <varname>input_configuration_file</varname></codeph> option. Use
                        one of the following formats to specify the host
                        information:<codeblock>&lt;hostname>~&lt;address>~&lt;port>~&lt;data_directory>/&lt;seg_prefix>&lt;segment_id>~&lt;dbid>~&lt;content_id></codeblock>or
                        <codeblock>&lt;host>~&lt;port>~&lt;data_directory>/&lt;seg_prefix>&lt;segment_id>~&lt;dbid>~&lt;content_id></codeblock></pd>
                    <pd>The first format populates the <codeph>hostname</codeph> and
                            <codeph>address</codeph> fields in the
                            <codeph>gp_segment_configuration</codeph> catalog table with the
                            <varname>hostname</varname> and <varname>address</varname> values
                        provided in the input configuration file. The second format populates
                            <codeph>hostname</codeph> and <codeph>address</codeph> fields with the
                        same value, derived from <varname>host</varname>.</pd>
                    <pd>The Greenplum Database master always uses the value <codeph>-1</codeph> for
                        the segment ID and content ID. For example, the
                            <varname>seg_prefix&lt;segment_id></varname> and
                            <varname>content_id</varname> values for
                            <codeph>QD_PRIMARY_ARRAY</codeph> use <codeph>-1</codeph> to indicate
                        the master
                        instance:<codeblock>QD_PRIMARY_ARRAY=cdw~cdw~5432~/gpdata/master/gpseg-1~1~-1
declare -a PRIMARY_ARRAY=(
sdw1~sdw1~40000~/gpdata/data1/gpseg0~2~0
sdw1~sdw1~40001~/gpdata/data2/gpseg1~3~1
sdw2~sdw2~40000~/gpdata/data1/gpseg2~4~2
sdw2~sdw2~40001~/gpdata/data2/gpseg3~5~3
)
declare -a MIRROR_ARRAY=(
sdw2~sdw2~50000~/gpdata/mirror1/gpseg0~6~0
sdw2~sdw2~50001~/gpdata/mirror2/gpseg1~7~1
sdw1~sdw1~50000~/gpdata/mirror1/gpseg2~8~2
sdw1~sdw1~50001~/gpdata/mirror2/gpseg3~9~3
)</codeblock></pd>
                    <pd>To re-create a cluster using a known Greenplum Database system
                        configuration, you can edit the segment and content IDs to match the values
                        of the system.</pd>
                </plentry>
                <plentry>
                    <pt>HEAP_CHECKSUM</pt>
                    <pd><b>Optional.</b> This parameter specifies if checksums are enabled for heap
                        data. When enabled, checksums are calculated for heap storage in all
                        databases, enabling Greenplum Database to detect corruption in the I/O
                        system. This option is set when the system is initialized and cannot be
                        changed later.</pd>
                    <pd>The <codeph>HEAP_CHECKSUM</codeph> option is on by default and turning it
                        off is strongly discouraged. If you set this option to off, data corruption
                        in storage can go undetected and make recovery much more difficult.</pd>
                    <pd>To determine if heap checksums are enabled in a Greenplum Database system,
                        you can query the <codeph>data_checksums</codeph> server configuration
                        parameter with the <codeph>gpconfig</codeph> management
                        utility:<codeblock>$ gpconfig -s data_checksums</codeblock></pd>
                </plentry>
                <plentry id="hba_hostnames">
                    <pt>HBA_HOSTNAMES</pt>
                    <pd><b>Optional.</b> This parameter controls whether
                            <codeph>gpinitsystem</codeph> uses IP addresses or host names in the
                            <codeph>pg_hba.conf</codeph> file when updating the file with addresses
                        that can connect to Greenplum Database. The default value is
                            <codeph>0</codeph>; with this setting, the utility uses IP addresses
                        when updating the file. When initializing a Greenplum Database system,
                        specify <codeph>HBA_HOSTNAMES=1</codeph> to have the utility use host names
                        in the <codeph>pg_hba.conf</codeph> file. </pd>
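                    <pd>For example, with the default <codeph>HBA_HOSTNAMES=0</codeph> the utility
                        adds CIDR-style IP address entries, while <codeph>HBA_HOSTNAMES=1</codeph>
                        adds host name entries instead. A hypothetical pair of resulting
                            <codeph>pg_hba.conf</codeph> entries (the user name, host name, and
                        address shown are placeholders, not values generated from this
                        page):<codeblock># HBA_HOSTNAMES=0 (default): IP address entry
host  all  gpadmin  192.168.1.101/32  trust
# HBA_HOSTNAMES=1: host name entry
host  all  gpadmin  sdw1  trust</codeblock></pd>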
                    <pd>For information about how Greenplum Database resolves host names in the
                            <codeph>pg_hba.conf</codeph> file, see <xref
                            href="../../admin_guide/client_auth.html#topic1" format="html" scope="external">Configuring Client Authentication</xref>.</pd>
                </plentry>
            </parml>
        </section>
        <section>
            <title>Specifying Hosts using Hostnames or IP Addresses</title>
            <p>When initializing a Greenplum Database system with <codeph>gpinitsystem</codeph>, you
                can specify segment hosts using either hostnames or IP addresses. For example, you
                can use hostnames or IP addresses in the file specified with the <codeph>-h</codeph>
                    option.<ul id="ul_ijf_tvq_bmb">
                    <li>If you specify a hostname, the resolution of the hostname to an IP address
                        should be done locally for security. For example, you should use entries in
                        a local <codeph>/etc/hosts</codeph> file to map a hostname to an IP address.
                        The resolution of a hostname to an IP address should not be performed by an
                        external service such as a public DNS server. You must stop the Greenplum
                        system before you change the mapping of a hostname to a different IP
                        address.</li>
                    <li>If you specify an IP address, the address should not be changed after the
                        initial configuration. When segment mirroring is enabled, replication from
                        the primary to the mirror segment will fail if the IP address changes from
                        the configured value. For this reason, you should use a hostname when
                        initializing a Greenplum Database system unless you have a specific
                        requirement to use IP addresses.</li>
                </ul></p>
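            <p>As an example of local hostname resolution, a hypothetical
                    <codeph>/etc/hosts</codeph> fragment on each Greenplum host might map segment
                host names to their addresses (the names and addresses shown are
                placeholders):<codeblock># /etc/hosts — resolve segment host names locally,
# not through an external DNS service
192.168.1.101   sdw1
192.168.1.102   sdw2</codeblock></p>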
            <p>When initializing the Greenplum Database system, <codeph>gpinitsystem</codeph> uses
                the initialization information to populate the <xref
                    href="../../ref_guide/system_catalogs/gp_segment_configuration.xml"
                    >gp_segment_configuration</xref> catalog table and adds hosts to the
                    <codeph>pg_hba.conf</codeph> file. By default, the host IP address is added to
                the file. Specify the <codeph>gpinitsystem</codeph> configuration file parameter
                    <xref href="#topic1/hba_hostnames" format="dita">HBA_HOSTNAMES</xref>=1 to add
                hostnames to the file.</p>
            <p>Greenplum Database uses the <codeph>address</codeph> value of the
                    <codeph>gp_segment_configuration</codeph> catalog table when looking up host
                systems for Greenplum interconnect (internal) communication between the master and
                segment instances and between segment instances, and for other internal
                communication.</p>
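            <p>To review the <codeph>hostname</codeph> and <codeph>address</codeph> values recorded
                at initialization, you can query the catalog table directly. For
                example:<codeblock>$ psql -d postgres -c "SELECT dbid, content, port, hostname, address
    FROM gp_segment_configuration ORDER BY content;"</codeblock></p>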
        </section>
        <section id="section6">
            <title>Examples</title>
            <p>Initialize a Greenplum Database system by supplying a cluster configuration file and
                a segment host address file, and set up a spread mirroring
                    (<codeph>--mirror-mode=spread</codeph>) configuration:</p>
            <codeblock>$ gpinitsystem -c gpinitsystem_config -h hostfile_gpinitsystem --mirror-mode=spread</codeblock>
            <p>Initialize a Greenplum Database system and set the superuser remote password:</p>
            <codeblock>$ gpinitsystem -c gpinitsystem_config -h hostfile_gpinitsystem --su_password=mypassword</codeblock>
            <p>Initialize a Greenplum Database system with an optional standby master host:</p>
            <codeblock>$ gpinitsystem -c gpinitsystem_config -h hostfile_gpinitsystem -s host09</codeblock>
            <p>Initialize a Greenplum Database system and write the provided configuration to an
                output file, for example <codeph>cluster_init.config</codeph>:</p>
            <codeblock>$ gpinitsystem -c gpinitsystem_config -h hostfile_gpinitsystem -O cluster_init.config</codeblock>
            <p>The output file uses the <codeph>QD_PRIMARY_ARRAY</codeph> and
                    <codeph>PRIMARY_ARRAY</codeph> parameters to define master and segment
                hosts:</p>
            <codeblock>ARRAY_NAME="Greenplum Data Platform"
TRUSTED_SHELL=ssh
CHECK_POINT_SEGMENTS=8
ENCODING=UNICODE
SEG_PREFIX=gpseg
HEAP_CHECKSUM=on
HBA_HOSTNAMES=0
QD_PRIMARY_ARRAY=mdw~mdw.local~5433~/data/master1/gpseg-1~1~-1
declare -a PRIMARY_ARRAY=(
mdw~mdw.local~6001~/data/primary1/gpseg0~2~0
)
declare -a MIRROR_ARRAY=(
mdw~mdw.local~7001~/data/mirror1/gpseg0~3~0
)</codeblock>
            <p>Initialize a Greenplum Database system from an input configuration file, a file that
                defines the cluster with the <codeph>QD_PRIMARY_ARRAY</codeph> and
                    <codeph>PRIMARY_ARRAY</codeph> parameters:</p>
            <codeblock>$ gpinitsystem -I cluster_init.config</codeblock>
            <p>The following example uses a host system configured with multiple NICs. If host
                systems are configured with multiple NICs, you can initialize a Greenplum Database
                system to use each NIC as a Greenplum host system. You must ensure that the host
                systems are configured with sufficient resources to support all the segment
                instances being added to the host. Also, if high availability is enabled, you must
                ensure that the Greenplum system configuration supports failover if a host system
                fails. For information about Greenplum Database mirroring schemes, see <xref
                  href="../../best_practices/ha.html#topic_ngz_qf4_tt" format="html" scope="external"/>.</p>
            <p>For this simple master and segment instance configuration, the host system
                    <codeph>gp6m</codeph> is configured with two NICs <codeph>gp6m-1</codeph> and
                    <codeph>gp6m-2</codeph>. In the configuration, the <xref
                    href="#topic1/array_params" format="dita">QD_PRIMARY_ARRAY</xref> parameter
                defines the master instance using <codeph>gp6m-1</codeph>. The <xref
                    href="#topic1/array_params" format="dita">PRIMARY_ARRAY</xref> and <xref
                    href="#topic1/array_params" format="dita">MIRROR_ARRAY</xref> parameters use
                    <codeph>gp6m-2</codeph> to define a primary and mirror segment instance.
                <codeblock>QD_PRIMARY_ARRAY=gp6m~gp6m-1~5432~/data/master/gpseg-1~1~-1
declare -a PRIMARY_ARRAY=(
gp6m~gp6m-2~40000~/data/data1/gpseg0~2~0
gp6s~gp6s~40000~/data/data1/gpseg1~3~1
)
declare -a MIRROR_ARRAY=(
gp6s~gp6s~50000~/data/mirror1/gpseg0~4~0
gp6m~gp6m-2~50000~/data/mirror1/gpseg1~5~1
)</codeblock></p>
        </section>
        <section id="section7"><title>See Also</title><xref href="./gpssh-exkeys.xml#topic1"
                type="topic" format="dita"/>, <xref href="gpdeletesystem.xml#topic1" type="topic"
                format="dita"/>, <xref href="../../install_guide/init_gpdb.xml" type="topic"
                scope="peer">Initializing Greenplum Database</xref>.</section>
    </body>
</topic>
